# Blogger to Hexo conversion

**WARNING:** These instructions are still a work in progress: incomplete, untested, and likely to change. If you follow this process, you do so at your own risk, so take backups first. Do NOT apply these instructions to any production/live site. I am not responsible for any damage.

I created this repo to explain how I converted my Blogger blog to [Hexo](https://hexo.io/), a more open, parseable and manageable static site generator.


## Step 1: Export data from Blogger

- First I logged in to the Blogger dashboard and went to my blog.
- Navigated to Settings -> Other.
- Under **Import & backup**, clicked **Backup content** then **Save to computer**.

This gave me a `blog-dd-mm-yyyy.xml` file. [Ref][1]


## Step 2: Import into WordPress

- Set up a WordPress installation as usual; a blank, up-to-date install is recommended
- **Trash** (delete) the test pages and posts that WordPress creates, then delete them permanently from Trash (otherwise they will be included in the export we're going to do later)
- From WP Admin, go to Tools -> Import
- Click **Install Now** under **Blogger** importer
- When it completes, click **Run Importer** under **Blogger** importer
- Choose the `.xml` file you exported earlier, click **Upload file and import** and follow the on-screen instructions to complete the import


## Step 3: Change Categories to tags

Blogger does not have categories or tags. It has "labels", and the Blogger importer in WP turns those labels into categories. Hexo, however, does not handle this well: it places the categories inside one another and makes a mess of the interface.

This is because Hexo by default treats multiple categories as hierarchical. For example, the front matter of a post with multiple categories converts into something like:

```
categories:
  - Movies
  - Cars
  - Food
```

Hexo reads this list as a hierarchy (Food inside Cars inside Movies), which is weird. There is a thing called category grouping:

```
categories:
  - [Movies, Cars, Food]
```

This does not help, because a bracketed list is itself read as one hierarchy. The categories still end up nested inside one another.
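To make the nesting concrete, here is a minimal sketch of how a flat list turns into one chain of parent and child categories. It is a simplified model for illustration, not Hexo's actual code, and `categoryPaths` is a made-up name.

```javascript
// Simplified model (not Hexo's implementation): every entry in a flat
// `categories` list becomes a child of the entry before it.
function categoryPaths(categories) {
  const paths = [];
  categories.forEach((name, index) => {
    // Join the current entry with all of its ancestors.
    paths.push(categories.slice(0, index + 1).join('/'));
  });
  return paths;
}

console.log(categoryPaths(['Movies', 'Cars', 'Food']));
// -> [ 'Movies', 'Movies/Cars', 'Movies/Cars/Food' ]
```

Three labels that were equals on Blogger end up as a single three-level branch, which is the mess described above.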

So it may be better to convert those categories into tags. After all, categories and tags do the same job: you click on one and get a page listing the posts under it. Tags also stay semantically in line with labels. I went this way. Maybe you'd like to keep the categories as they are. That's fine too.

If you'd like to go my way, then convert categories into tags:
- From WP Admin, go to Tools -> Import
- Click **Install Now** under **Categories and Tags Converter**
- When it completes, click **Run Importer** under **Categories and Tags Converter**
- Click **Categories to Tags**, click **Check All** and run the conversion.
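The converter above does the work inside WordPress. The transformation itself is small, and a rough sketch of it on a plain object standing in for a post's front matter could look like this. The function name and the object shape are assumptions for illustration only.

```javascript
// Hypothetical helper: move every category of a post into its tags.
// `frontMatter` is a plain object such as { title, categories, tags }.
function categoriesToTags(frontMatter) {
  // Accept a single string, a flat list, or a list of lists.
  const categories = [].concat(frontMatter.categories || []).flat();
  const tags = [].concat(frontMatter.tags || []);
  // A Set drops labels that already exist as tags.
  const merged = [...new Set([...tags, ...categories])];
  const { categories: _removed, ...rest } = frontMatter;
  return { ...rest, tags: merged };
}

console.log(categoriesToTags({
  title: 'Road trip',
  categories: ['Movies', 'Cars'],
  tags: ['Cars'],
}));
// -> { title: 'Road trip', tags: [ 'Cars', 'Movies' ] }
```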


## Step 3: Import external images into WordPress

Blogger has some issues with how it handles images. It emits markup like this:

```
<a href="https://1.bp.blogspot.com/-CJNSp3KUFR8/XJMEpQ2bb0I/AAAAAAAACho/oDw643NqRU47i0HUQ1H_ryPFEjFCL1NrgCLcBGAs/s1600/01.json-wallpaper-response.png" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="470" data-original-width="836" src="https://1.bp.blogspot.com/-CJNSp3KUFR8/XJMEpQ2bb0I/AAAAAAAACho/oDw643NqRU47i0HUQ1H_ryPFEjFCL1NrgCLcBGAs/s320/01.json-wallpaper-response.png" width="320" height="179" border="0"></a>
```

This looks seriously messed up. Plus, Blogger's long CDN links don't help either.

One thing to note is that the `<a>` points to the original big image and the `<img>` to the smaller one (if you chose, in Blogger's WYSIWYG editor, to show a shrunk version in the post body instead of the full-size one). Notice the `.../s1600/...` in the `a href` and `s320` in the `img src`. It probably describes the dimensions.
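Since the two URLs differ only in that size segment, the full-size URL can be derived from the small one. A minimal sketch, assuming the segment always has the form `sNNN` as in the example above (other variants are not handled, and `toFullSize` is a made-up name):

```javascript
// Swap the size segment (e.g. /s320/) right before the file name for /s1600/.
// Assumes the `sNNN` form shown above; URLs without it are returned unchanged.
function toFullSize(url) {
  return url.replace(/\/s\d+\/(?=[^/]+$)/, '/s1600/');
}

const small = 'https://1.bp.blogspot.com/-CJNSp3KUFR8/XJMEpQ2bb0I/AAAAAAAACho/oDw643NqRU47i0HUQ1H_ryPFEjFCL1NrgCLcBGAs/s320/01.json-wallpaper-response.png';
console.log(toFullSize(small));
// prints the same URL with /s1600/ in place of /s320/
```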

We don't want the images to be served from Blogger: if Blogger ever shuts down, the images die with it. It is better to keep our images with our blog source code.

There is a plugin called **Auto Upload Images** that automatically uploads external images into WordPress. If we run the plugin on this mess of code, the smaller image will be uploaded to WP, because the plugin doesn't know how Blogger inserts images. It only looks at `<img src` and imports the smaller images. So we'll have to replace the whole code above with something as simple as this:

```
<img src="https://1.bp.blogspot.com/-CJNSp3KUFR8/XJMEpQ2bb0I/AAAAAAAACho/oDw643NqRU47i0HUQ1H_ryPFEjFCL1NrgCLcBGAs/s1600/01.json-wallpaper-response.png" />
```

We have taken the full-size image URL and put it in the `<img src`. Now the plugin will see the full-size image (if we decide to use it).
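The rewrite described here can be sketched with a regular expression. This is only an illustration of the idea in JavaScript, not the code of the plugin mentioned below; `simplifyBloggerImages` is a hypothetical name, and a regex is a rough tool for HTML that only copes with the simple markup shown above.

```javascript
// Replace <a href="FULL"><img src="SMALL" ...></a> with <img src="FULL" />.
// Only links whose target looks like an image file are rewritten.
function simplifyBloggerImages(html) {
  const linkedImage = /<a\b[^>]*\bhref="([^"]+)"[^>]*>\s*<img\b[^>]*>\s*<\/a>/gi;
  return html.replace(linkedImage, (match, href) => {
    const isImage = /\.(png|jpe?g|gif|webp)$/i.test(href);
    // Leave ordinary links that merely wrap an image untouched.
    return isImage ? `<img src="${href}" />` : match;
  });
}

const before = '<a href="https://example.com/s1600/a.png" style="margin-left: 1em;">'
  + '<img src="https://example.com/s320/a.png" width="320" border="0"></a>';
console.log(simplifyBloggerImages(before));
// -> <img src="https://example.com/s1600/a.png" />
```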

Don't worry about putting an `<a>` around the image. Hexo automatically shows a lightbox with the bigger image when it is clicked, at least in the default theme. So we can get rid of the `<a>`.

Do you want to do this for all images yourself? I think not. Luckily, I have written a plugin for this.

There is another issue. You will find that the post slugs after import differ from what they were on Blogger. The WordPress Blogger Importer sets the slugs from the post title, which is not always the same as the slug you set on your Blogger post. [There is a fix](https://www.isitwp.com/move-from-blogger-to-wordpress-resolved/) which I have included in the plugin as well.

So, the plugin does 3 things:
1. replaces the image HTML code with a plain `<img>` at the highest resolution (explained above);
2. updates the slug to match the one on Blogger;
3. replaces the Blogger read more anchor link with `<!--more-->` for WordPress.
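Points 2 and 3 can be illustrated with two small functions. This is a sketch in JavaScript, not the plugin's code. It assumes Blogger post URLs of the form `/yyyy/mm/slug.html` and a read more marker written as `<a name='more'></a>`; both are assumptions about Blogger's output, and the function names are made up.

```javascript
// Take the original slug from a Blogger post URL (assumed /yyyy/mm/slug.html).
function bloggerSlug(postUrl) {
  const { pathname } = new URL(postUrl);
  const lastSegment = pathname.split('/').filter(Boolean).pop() || '';
  return lastSegment.replace(/\.html$/, '');
}

// Replace the assumed Blogger read more anchor with the WordPress marker.
function replaceReadMore(html) {
  return html.replace(/<a\s+name=['"]more['"]\s*>\s*<\/a>/gi, '<!--more-->');
}

console.log(bloggerSlug('https://example.blogspot.com/2019/03/my-first-post.html'));
// -> my-first-post
console.log(replaceReadMore("Intro<a name='more'></a>Rest of the post"));
// -> Intro<!--more-->Rest of the post
```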

To use the import fix plugin:

- Copy the `wp-blogger-import-fix` folder into `wp-content/plugins`
- Make sure you have increased the maximum execution time (`max_execution_time`) in your php.ini, then restart the Apache service. You may have to guess how much to increase it based on how many posts you have. I had `120` set.
- From `wp-admin`, activate the `blogger-import-fix` plugin. It will start converting code automatically.

The WordPress importer automatically imports images into `wp-content/uploads` and changes the URLs in the post body. If some images did not import for some reason (which happens), they will be imported by the conversion script we'll use later. If you want to upload those images into WordPress, you can check out `wp-auto-upload-images.md`.


## Step 4: Convert into Hexo

- Export your WordPress blog with Tools -> Export
- Choose **All content**, then click **Download Export File**.

This will give you a `yourblogname.WordPress.yyyy-mm-dd.xml` file. The Hexo migration plugin will later parse this file and create posts from it.

- Create your Hexo site as usual (if not already done)

```
npm install -g hexo-cli
hexo init hexotest
cd hexotest
npm install
```

- Then install the migration plugin

```
npm install hexo-migrator-wordpress --save
npm install --save xml2js
npm install --save turndown
npm install --save request
```

I set [`post_asset_folder`][2] to `true` on my `_config.yml`:

```
post_asset_folder: true
```

This creates a folder for each post to hold its assets (like images). I think this is better than having one `images` folder and dumping all images into it inconsiderately.

Then I started the migration process: [(Doc)][4]

```
hexo migrate wordpress /path/to/yourblogname.WordPress.yyyy-mm-dd.xml
```

It imported posts one by one. At the end it said something like:
```
...
INFO  115 posts migrated.
```

A `hexo generate` followed by `hexo serve` should make your site available at `http://localhost:4000`.

So now we will bring the images into Hexo. I have a functioning script in the `hexo-auto-upload-images` folder in this repo. Copy the `auto-upload-images.js` file into your `<hexo root>/scripts` folder. Make sure you have `post_asset_folder: true` set in your `_config.yml` and install a relative path plugin with `npm install --save hexo-asset-link`. Then run `hexo generate`. It will go through all the `.md` files, download the linked images into each post's folder and replace the old image URLs with references to the downloaded files.
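To give an idea of what such a script has to do per post, here is a minimal sketch of the URL rewriting step only. It is not the code of `auto-upload-images.js`: the function name is hypothetical, nothing is downloaded, and referencing the image by its bare file name is an assumption about how the asset folder is linked.

```javascript
// Find remote images in a post's Markdown, list them for download,
// and point the Markdown at a local file name instead.
function planImageDownloads(markdown) {
  const downloads = [];
  const remoteImage = /(!\[[^\]]*\]\()(https?:\/\/[^)\s]+)(\))/g;
  const rewritten = markdown.replace(remoteImage, (match, open, url, close) => {
    // Use the last path segment as the local file name.
    const fileName = new URL(url).pathname.split('/').pop() || 'image';
    downloads.push({ url, fileName });
    return `${open}${fileName}${close}`;
  });
  return { markdown: rewritten, downloads };
}

const plan = planImageDownloads('Intro ![shot](https://example.com/a/b/pic.png) outro');
console.log(plan.markdown);
// -> Intro ![shot](pic.png) outro
console.log(plan.downloads);
// -> [ { url: 'https://example.com/a/b/pic.png', fileName: 'pic.png' } ]
```

A real script would then fetch every entry in `downloads` into the post's asset folder and deal with file names that collide.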


## Step 5: Converting comments into Hexo

Don't forget the comments! Fortunately, the Blogger export file includes comments as well, which let the WP importer import them and attach them to the posts. So the good news is that we have the comments in our own WP database on our own servers. Now comes the task of getting them into Hexo.

I have looked into commenting solutions for Hexo. They include Disqus, barely known third-party services that could shut down without notice, and even some that use GitHub issues as comments! Disqus is closed source so I would prefer not to use it. GitHub is closed source (at least as I'm writing this) and issues are not supposed to be used as comments, so no!

As a better balance between being open and being able to manage the service myself, I chose [HashOver](https://www.barkdull.org/software/hashover). I chose v2.0 despite it being in active development because it has database support. v1 is flat file and requires file permissions to be set to 0777, which I'm not a fan of.

- Download v2 (a.k.a. hashover-next) from: [https://github.com/jacobwb/hashover-next](https://github.com/jacobwb/hashover-next)
- Do the setup using the [doc from here](https://docs.barkdull.org/hashover-v2/setup). HashOver Next supports flat file formats such as XML and JSON, but you'll have to use MySQL because the script we'll run later requires it.
- Add this to your head. You can change theme name if you want (I added it in `themes/landscape/layout/_partial/head.ejs`):
```
<% if (page.path !== 'index.html'){ %>
<link rel="stylesheet" type="text/css" href="http://domain.com/hashover/themes/default/comments.css">
<% } %>
```
- Add this somewhere in the body (I added it in `themes/landscape/layout/_partial/article.ejs`):
```
<% if (!index && post.comments){ %>
<div id="hashover"></div>
<% } %>
```
- Add this to head or at the end of body (I placed it in `themes/landscape/layout/_partial/footer.ejs`):
```
<% if (page.path !== 'index.html'){ %>
<script type="text/javascript" src="http://domain.com/hashover/comments.php"></script>
<% } %>
```
- If HashOver is not showing up, you may need to apply [some workaround](https://github.com/jacobwb/hashover-next/issues/281#issuecomment-612024133).
- Also, the HashOver docs are not clear enough on this, but setting which domains may access the comments is crucial (e.g. GitLab Pages). So you'll need to edit `hashover/config/settings.json` to something like the snippet below. I had my Hexo site running at `http://localhost:4000`, so I removed `http://` from the URL and put it in the `allowed-domains` value. I also added my `gitlab.io` domain since I plan to deploy my site on GitLab Pages. (I would remove the `localhost:4000` entry from a real production install, though.) Also, as part of the transition, I want to show HashOver comments on my archived Blogger blog, so I added it as well, without the `.com` so that it also matches country-specific domains such as `xyz.blogspot.cn`. Something like this:
```
...
	"minifies-javascript": false,
	"minify-level": 1,
	"allowed-domains": ["localhost:4000", "mysite.gitlab.io", "myblogsubdomain.blogspot"]
}
```

- Also, make sure to set `data-format` to `sql` in the same file.
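The two settings changes can be sketched as one small function over the parsed settings object. Only the `allowed-domains` and `data-format` keys come from the steps above; the function name is hypothetical, and reading and writing `settings.json` is left out.

```javascript
// Return a copy of the HashOver settings with the two changes applied.
// Site URLs are reduced to bare hosts, as done by hand above.
function applyHashOverSettings(settings, siteUrls) {
  const domains = siteUrls.map((url) =>
    url.replace(/^https?:\/\//, '').replace(/\/+$/, ''));
  return { ...settings, 'data-format': 'sql', 'allowed-domains': domains };
}

console.log(applyHashOverSettings(
  { 'minify-level': 1 },
  ['http://localhost:4000', 'https://mysite.gitlab.io/'],
));
// -> { 'minify-level': 1, 'data-format': 'sql',
//      'allowed-domains': [ 'localhost:4000', 'mysite.gitlab.io' ] }
```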

If you ever see that the tables are not being created, you can use [an SQL query](https://github.com/jacobwb/hashover-next/issues/247), available as `hashover.sql` in this repo.

Generally, posting a test comment should create the tables, and running the above SQL should not be necessary.

But we'll also have to import all the comments from WP. There is a PHP script for this called [wp2hashover](https://github.com/kepon85/wp2hashover).

Clone the repo or download the project. Copy `config.example.php` to `config.php` in wp2hashover and change your settings. All the settings are self-explanatory. For the "thread_syntax" values, use your `permalink` value from `_config.yml` and (a) delete the last `/` from the end and (b) replace the remaining `/`s with `-`.

For example, if you have this in `_config.yml`:
```
permalink: :year/:month/:day/:title/
```
then use this in `config.php`:
```
$hashover_thread_syntax_posts=':year-:month-:day-:title';
$hashover_thread_syntax_pages=':title';
```
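The rule is mechanical enough to write down as a one-liner. A minimal sketch (the function name is made up; it only restates the two steps above):

```javascript
// Turn a Hexo permalink pattern into the thread syntax described above:
// drop the trailing slash, then replace the remaining slashes with dashes.
function toThreadSyntax(permalink) {
  return permalink.replace(/\/+$/, '').replace(/\//g, '-');
}

console.log(toThreadSyntax(':year/:month/:day/:title/'));
// -> :year-:month-:day-:title
```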

Now put it somewhere on your PHP-enabled server and open `wp2hashover.php` in a browser. It should automatically import all the comments from your WordPress DB into the HashOver DB. Look at the output on the page. If it shows an error, either try to find the cause or open an issue on the wp2hashover project.


Ref:

[1]: https://support.google.com/blogger/answer/41387?hl=en
[2]: https://hexo.io/docs/asset-folders
[3]: https://dustinpfister.github.io/2018/01/03/hexo-plugins/
[4]: https://hexo.io/docs/migration.html